ABSTRACT



To improve computer performance, problems of emulation
such as WAR hazards, uneven utilization of machine
resources, unnecessary dependencies, wasted hardware
resources and data buffer pollution are alleviated by
responding to dynamic execution information, such as
branch prediction, register usage, overflow, a history of
branch predictions of groups of branches combined, and a
history of register usage, for: dynamically modifying
instruction parameters of an emulation sequence of instructions;
reordering emulated instructions; and adding or changing
the dynamic execution information.

8 Claims, 3 Drawing Sheets 
EXECUTION TIME MODIFICATION OF INSTRUCTION EMULATION PARAMETERS

BACKGROUND OF THE INVENTION

The present invention relates to emulation of instructions
for execution by an instruction processor, in a computer
environment.

An instruction, in digital computer operations, is a set of
bits defining an operation. The instruction may comprise an
operation code specifying the operation to be performed,
one or more operands or their addresses, and one or more
modifiers or their addresses (to modify the operand or its
address). An instruction set, also called an instruction code,
comprises symbols and characters that compose the syntax
of a computer programming language, and in a computer's
basic machine code, the part that specifies how characters or
digits are used to represent the codes within the machine's
instruction set.

Processors often emulate instructions, so that a first computer
system may behave in the same manner as a second computer
system, for instructions that are not directly implemented in
the first system. Examples of such emulation include 1) running
Java byte codes on a general purpose computer, e.g. so that a
general purpose computer can run Java software written for
another machine, to provide a Java virtual machine, 2) supporting
instructions of a different instruction set architecture for
compatibility reasons, and 3) operating a microprocessor as a
terminal of a network in order to communicate with mainframes.
Emulation includes a computer, device, program or combination
thereof imitating the function of another computer, device,
program or combination thereof. The emulation may be done in
hardware or firmware or software or some combination thereof,
such hardware or software or firmware or combination thereof
being an emulator.

Emulation through hardware is considerably faster than emulation
through software. When hardware emulation is used, the instruction
that is being emulated is often "translated" into an instruction
emulation sequence of one or more instructions in the native
instruction set of the CPU being used (such native instruction set
comprising the instructions that have been implemented) and this
translated sequence of native instructions is then executed. The
translation sequence, that is the instruction emulation sequence,
is fixed (as opposed to dynamic), as the details of emulation for
each instruction that needs to be emulated are known at design time.

Translation of an instruction from one language to another
is performed by compilers, assemblers and interpreters, for
example.

SUMMARY OF THE INVENTION

The present invention analyzes problems, identifies and
analyzes causes of the problems, and provides solutions to
the problems. This analysis of the problems, the identification
and analysis of the causes, and the provision of solutions
are each a part of the present invention and will be set
forth below.

The invention identifies and analyzes problems of emulation
such as uneven utilization of machine resources, unnecessary
dependencies, wasted hardware resources, overflow, and data
buffer pollution. The invention provides the solution of
dynamically modifying the emulation sequence, particularly the
parameters of instructions in the sequence. Such modifying is in
response to dynamic execution information obtained by executing
an emulated sequence of instructions. The dynamic execution
information may be provided by a module, like a branch predictor,
with a prediction modified by the present invention.

The prior art fixed nature of a hardware translation emulation
sequence is identified by the inventor as causing an uneven
utilization of machine resources. For example, some out of many
temporary registers are repeatedly used while others are not used.
This uneven utilization has been analyzed and found to lead to
unnecessary dependencies that affect performance. For example,
when one register is repeatedly used for temporary values due to
emulation, the instructions that depend upon those values must be
executed before the register value is changed and thus the use of
out-of-order machines and parallel or multi processing use is
limited. A stall or Write-After-Read (WAR) hazard may occur.
Resolving such dependencies according to the prior art can
potentially waste hardware resources, for example a register
renaming resource. Also such dependencies of the emulated
sequence may require that multiprocessing, out-of-order
processing, etc. not be used.

Conventional emulation sequence generation can have
undesirable side effects like polluting data buffers. For
example, when a data buffer is rewritten due to emulation,
the data buffer is polluted if the thus destroyed value is
needed in a subsequent instruction, whose execution will
thereby try to use a value from the data buffer that has been
displaced.

This invention alleviates some of the above-mentioned
problems by dynamically (i.e. in response to dynamic execution
information generated during the emulation or execution of the
emulated sequence of instructions, as opposed to at the time of
architecture design), modifying the emulation [...] or with
overflow or changing a branch prediction or allocating temporary
registers from a pool of registers. Such modification may include
the changing of the sequence or scheduling of instructions or
instruction clusters, changing the order of clusters, and changing
the sequence of parameters. The cost for implementing the
dynamic modifying of parameters is reasonable and the
benefit is commensurate with the design support.

The embodiment describes the use of dynamic execution
information to generate improved and optimal instruction
emulation sequences. Examples of such dynamic execution
information are dynamic branch prediction information,
overflow and temporary register allocation. The dynamic
execution information may come from an historical state of
the resources such as registers as to their use and cycling, or
from a branch prediction state machine that keeps a long
history of branches at particular addresses and a long history
of branches taken and not taken to set likely and unlikely
flags or condition codes for each of the branches or in
consideration of the flags of a group of branches, or from a
resource usage that determines overflow error.

A branch prediction guesses whether a branch will be
taken in a program and fetches code accordingly. When a
branch is taken, the next instruction of the branch sequence
is stored in fast memory, such as a cache, and the "next
instruction" is therefore ready to be removed from such fast
storage to be used the next time a branch, which may or may
not be the same branch, is encountered, to thereby predict
which way the instruction will branch, which prediction is
correct about 90% of the time.

U.S. Pat. No. 6,115,809 discloses branch prediction for
separate caching of instructions according to a classification
of either strong or weak likelihood of branching, for either
fixed or dynamic prediction. Although this patent does not
relate to emulation, this patent could be used to predict
which path will be taken among four possible paths defined
by two successive branches, for example, and provide
multiple condition codes for branching, prediction flags in
the patent. The disclosure is incorporated herein for an
implementation of a branch predictor.

Until an operation is completed, in the prior art, a parameter
is effectively treated as a constant value by the program.
The embodiment dynamically changes the parameters in the
emulation sequence during emulation, to improve performance.
A parameter, as an example of a dynamic execution
information, is a value that is given to a variable, for
example at the beginning of an operation or before an
expression is evaluated. A parameter can be text, a number,
or an argument name assigned to a value that is passed from
one routine to another. Examples of such parameters are
register fields within instructions. Parameters can customize
program operation.

In certain cases, dynamic execution information and
dynamic modifying of parameters are used to overcome
hardware restrictions. An example is the use of multiple
condition codes, even though the architecture does not
provide for this. A condition code is one of a set of bits that
are set as the result of previous machine instructions, and
they are hardware-specific. Condition codes include carry,
overflow, zero result, and negative result code. A particular
condition code may produce a conditional branch, a conditional
jump or a conditional transfer, for example.

In the embodiments, execution may be actual or virtual to
provide the dynamic execution information. For example,
virtual execution may be on a virtual machine (Java code
using a sandbox, e.g.) where there is no access to the file
system of a computer or computers on which they are
executing. Further, examples of emulated sequences of
instructions include not only being in the instruction set of
the executing computer but also runnable with such instruction
set, for example when in a cross-platform programming
language such as Java. Thus emulation could translate from
or into Java, for example.

BRIEF DESCRIPTION OF THE DRAWING

The present invention is illustrated by way of embodiments
and examples in the figures of the accompanying
drawings and in which like reference numerals refer to
similar elements. Further objects, features and advantages of
the present invention will become more clear from the
following detailed description of a preferred embodiment
and best mode of implementing the invention, as shown in
the drawing, wherein:

FIG. 1 is a schematic of a hardware and/or software
and/or firmware emulator of the present invention, with an
input sequence of instructions and an output emulated
sequence of instructions;

FIG. 2 shows a computer system using the emulator of
FIG. 1 in combination to implement an embodiment of the
present invention;

FIG. 3 is a flow chart of the method of operation of the
emulator of FIG. 1 operating in the computer system of FIG.
2, applicable to hardware and/or software and/or firmware
implementation; and

FIG. 4 is a flow chart of the method of generating an
historic register usage table used as a usage resource of
dynamic execution information with the emulator of FIG. 3.

DETAILED DESCRIPTION

The invention dynamically changes at least a component
of a computer system that generates an emulated sequence
of instructions, to improve performance.

Still other aspects, features and advantages of the present
invention are set forth in the following detailed description
of a particular embodiment, including the best mode
contemplated for carrying out the present invention, along with
specific examples of sequences of instructions. The present
invention is capable of implementation in other and different
embodiments, and its details can be modified in various
respects, all without departing from the spirit and scope of
the present invention. Accordingly, the drawing and description
are to be regarded as illustrative in nature, and not as
restrictive.

Although this embodiment is described using a specific
known microprocessor instruction set as input to the emulator
and a specific known different microprocessor instruction
set as output, the invention may be used in other
environments.

A computer system, emulation method, computer readable
medium with code for emulation, and an emulator for
generating an emulated sequence of instructions are
described, for the purposes of explanation, with specific
details, in order to provide a thorough understanding of the
present invention. However, one skilled in the art may
practice the present invention without these specific details
or with equivalents. Well-known structures and devices are
shown in block diagram form in order to avoid unnecessarily
obscuring the present invention.

FIG. 2 illustrates a computer system 100 as an embodiment
according to the present invention. A computer (for
example a micro-, mini-, super-, super scalar-, multi- and
out-of-order-processor) 101 includes: a bus 102 communicating
information among one or more processors 103 (e.g.
a CPU) and ROM 113 that stores static information and
instructions for the processor 103; main memory or storage
104, such as a random access memory (RAM) or other
dynamic storage device, coupled to the bus 102 for storing
information and instructions to be executed by the processor
103; and one or more cache memories 105, which may be on
a single chip with one or more of the processors 103 and/or
coupled with a processor by the bus 102. When the computer
system 100 has more than one of the processors 103, the
computer may be referred to as a multiprocessor or a
computer with superscalar architecture. The main memory
or storage 104 and one or more cache memories 105 are used
for storing temporary variables in registers Rn and temporary
registers TRn, or for storing other intermediate information
during execution of instructions and emulation by
the processor/s 103. The main memory 104 is used for
storing the program or code to control operation of and be
a part of the emulator 106, or the emulator 106 may be
firmware in the read only memory ROM 113. The emulator
106 may be hardware on a card or a board.

A magnetic disk or optical disk or other type of peripheral
storage 107, having computer readable media, is coupled to
the computer 101. A display 108 such as a cathode ray tube
(CRT) or liquid crystal display (LCD) or plasma display, an
input device 109 such as a keyboard and/or mouse, and any
other input 110 are coupled to the computer 101.
A general purpose input/output port (I/O) 111 couples the
computer 101 with other structure, for example with the
network 112, which is a LAN, WAN, WWW, or the Internet,
or the like, to which is coupled another similar computer
system 300, so that the computer system 100 may emulate
the instruction set of the computer system 300, or vice versa.
An original instruction sequence to be emulated is read into
main memory 104, for example, from another computer
system 300 or from a computer readable medium, such as
storage 107. Thus the computer system for emulation may
be local or distributed.

The execution may be for an end use (preferred embodiment)
or only to produce an emulated sequence of instructions
that is then stored for subsequent end use execution. In
the preferred embodiment, emulation is provided by the
computer system 100 during execution of an original
sequence of instructions, that is, emulation and execution are
effectively being conducted on a real-time or run-time basis,
or substantially simultaneously. The execution and emulation
may be in different computer systems or conducted with
different processors in the same computer system or
conducted on a single processor. The execution of the original
and emulated sequence of instructions produces dynamic
execution information from software or resource usage or
external hardware/software, which information is dynamically
used by the emulator in generating or modifying the
emulated sequence of instructions, as will be described
below with respect to FIG. 1 in more detail. This information
can be stored in temporary or hidden registers or other
memory.

The I/O 111 provides two-way data communication coupling
to the network 112. The I/O may be a digital subscriber
line (DSL) card or modem, an integrated services digital
network (ISDN) card, a cable modem, a telephone modem,
a cable, a wire, or wireless link to send and receive electrical,
electromagnetic, or optical signals that carry digital data
streams representing various types of information, including
instruction sequences. The communication with peripherals
may include, for example, a Universal Serial Bus (USB) or
a PCMCIA (Personal Computer Memory Card International
Association) interface.

Various forms of computer-readable media may carry
emulation code to transform a general purpose computer
into a special purpose computer that will thereby include the
emulator of the present invention. For example, the emulation
instructions for carrying out at least part of the present
invention may initially be on RAM 104, ROM 113, magnetic
disk 107, optical disc 107, flash memory 107, cache
105 or the like computer-readable media of a storage 104
locally associated with the processor 103 or to be transmitted
to a remote computer 300. The invention includes
emulation instructions on a computer readable medium and
as a data stream signal.

With reference to FIG. 1, an input instruction 205 is, by
way of example, a BT instruction, which is a conditional
branch to a target address (computed using a program
counter (PC) relative offset, #disp), if the branch condition
is true. The syntax for the BT instruction (an original
instruction from an original sequence of instructions to be
emulated) is:



EXAMPLE 1

LINE      INSTR.    OPERAND ETC.
Line 1    BT        #disp



In sequences of instructions to follow, the instructions are
written top to bottom in the order in which such instructions
are executed. A first column of line numbers is added only
for reference purposes herein. A second column provides the
instruction operation codes (OP CODE), which specify the
operation to be performed. A third column (OPERAND
ETC.) specifies one or more operands or their addresses and
one or more modifiers or their addresses (to modify the
operand or its address). A fourth column (FLAG) specifies
the likely flag content or condition code provided by the
branch predictor 203.

The emulated sequence of instructions obtained by emu- 
lation of the EXAMPLE 1, which emulation is according to 
the prior art, is: 



EXAMPLE 2

LINE      OP CODE   OPERAND ETC.      FLAG
Line 1    PT        #disp, TR0        // likely
Line 2    NOP
Line 3    BNE       R19, R63, TR0     // likely



NOP is a no operation instruction. BNE is a conditional 
branch instruction. TR0 is a temporary register used for the 
PT instruction that requests a prefetch. R19, R63 and TR0 
refer to specific registers and their contents, which may hold 
operands and modifiers used during the PT instruction 
operation or execution. Since the example original BT 
instruction has a provision to indicate if control is trans- 
ferred to a branch target as well as static branch prediction, 
the default branch prediction in the EXAMPLE 2 is that both 
the PT instruction and the BNE instruction are considered 
"likely" to branch. 

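The fixed translation behind EXAMPLE 2 can be pictured with a minimal Python sketch. The function name `emulate_bt_static` and the tuple layout (op code, operands, flag) are assumptions made here for illustration only; the point is that nothing in the output depends on run-time information.

```python
# Hypothetical model of a design-time (static) translation of BT,
# mirroring EXAMPLE 2. Names and data layout are illustrative only.

def emulate_bt_static(disp):
    """Return the fixed PT / NOP / BNE sequence for 'BT disp'.

    The temporary register (TR0), the compare register (R19) and the
    "likely" flag are hard-wired, as in the prior art sequence.
    """
    return [
        ("PT", f"{disp}, TR0", "likely"),
        ("NOP", "", ""),
        ("BNE", "R19, R63, TR0", "likely"),
    ]


if __name__ == "__main__":
    for number, (op, operands, flag) in enumerate(emulate_bt_static("#disp"), 1):
        comment = f"// {flag}" if flag else ""
        print(f"Line {number}  {op:<4} {operands:<16} {comment}")
```

Because every BT is translated the same way, two different branches always share TR0 and always carry the "likely" flag, which is the starting point for the problems listed next.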
This invention includes the identification and analysis of
problems of the statically determined emulation sequences
of the prior art, and of their causes, some of which, as
illustrated by examples, are:

Problem/Cause 1. Since the "likely" flag is always 
asserted for the PT instruction, it is possible that the branch 
target was fetched unnecessarily, which wastes memory 
bandwidth. 

Problem/Cause 2. Similarly since the BNE instruction is 
always considered "likely" for all branches (including those 
known to be rarely taken), there is a substantial chance of 
misprediction, which wastes execution cycles. 

Problem/Cause 3. TR0 is the temporary register for all BT 
instructions. A temporary register is a memory, such as a 
cache, that is used by a program or operating system to hold 
work in progress temporarily. The temporary register is 
needed only until the current session is terminated, at which 
time the contents may be saved in another storage or may be 
discarded. When there are two BT instructions in a loop or 
recursive call, which is very common, temporary register 
TR0 is used repeatedly, that is the same register is used for 
successive BT sessions. In such a case, a target instruction 
in the buffer of temporary register TR0 for one branch (the
first executed BT instruction) is likely to be replaced by a
different target instruction of another branch (the second BT
instruction). In a loop and a recursive execution, this
replacement takes place repeatedly, using execution cycles
for the replacements in the iterations and invocations. The
inventor has determined that the elimination of the thus
identified cause, that is the replacements, will eliminate the
corresponding wasted execution (machine) cycles and therefore
speed up the overall execution of the emulated BT
instruction.

Problem/Cause 4. The reuse of the temporary register
TR0 causes difficulty in superscalar processors wherein the
processor superscalar architecture enables multiple instructions
to be executed simultaneously for each clock cycle.
Superscalar or substantially simultaneous execution of two
BT instructions is difficult and uses machine cycles to keep
track of whether the current content of temporary register
TR0 is applicable to the first or second BT instruction. When
the content of temporary register TR0 is needed in one
machine cycle for execution of both BT instructions, it
would appear impossible to prevent an error according to the
prior art. The reuse of temporary registers causes false
dependencies, requiring these registers to be renamed in
order to relate the content to a particular code execution.
This may lead to sub-optimal use of the renaming resources.
In systems that do not support hardware register renaming,
such instructions with false dependencies may be stalled
unnecessarily. The inventor has determined that the elimination
of the thus identified and analyzed cause will eliminate
the corresponding execution (machine) cycles previously
needed and therefore speed up the overall execution of
the instruction, eliminate stalls and eliminate errors. Also the
elimination of a need to use a renaming resource will speed
up the operation and eliminate a need for a renaming
resource.

Problem/Cause 5. The reuse of the temporary register
TR0 causes difficulty in out-of-order issue machines, where
instructions may be executed out-of-order to avoid unnecessary
stalls. It is difficult to keep track of whether the
content of temporary register TR0 is applicable to the first or
second BT instruction for any one machine cycle, thus
requiring wasting machine cycles to keep track of register
usage. The reuse of temporary registers causes false dependencies,
requiring these registers to be renamed in order to
relate the content to a particular code execution. This may
lead to sub-optimal use of the renaming resources. In
systems that do not support hardware register renaming,
such instructions with false dependencies may be stalled
unnecessarily. The inventor has determined that the elimination
of the thus identified and analyzed cause will eliminate
the corresponding execution (machine) cycles previously
needed and therefore speed up the overall execution of
the instruction, to eliminate stalls and to eliminate errors.
Also the elimination of a need to use a renaming resource
will speed up the operation and eliminate its requirement.

This invention includes multi-part solutions to the above
identified problems, which solutions include:

Solution 1. The emulation of the embodiment is enhanced
with dynamic, run-time information.

In FIG. 1, the emulation system has a new emulation
sequence generator (in the example, implemented as hardware)
202 that receives an instruction 205, for example a BT
instruction, as an input. The emulation sequence generator
202 is a state machine that generates the appropriate emulation
sequence 204 at its output. The emulation sequence
generator 202 is enhanced with dynamic, run-time information
(that is, dynamic execution information) provided at
inputs, from the resource usage 201 (a known component
that may be a resource file that includes a resource map that
indexes resource data, structures, templates, definition
procedures, renaming procedures, management routines, icon
maps and so forth associated with a particular resource, such
as a menu, window, or dialog box, and in addition a new
component that has new resources such as the historical
register usage table and the historical branch prediction
table, which will be described later with respect to the
enhanced or improved embodiment) and the branch predictor
203 (a known component that will perform branch
prediction and which is preferably hardware, and in addition
one having new components such as the instruction group
branch prediction, which will be described later with respect
to the enhanced or improved embodiment), respectively. The
components 201 and 203 are examples of components that
may provide dynamic execution information as inputs to the
emulation sequence generator 202.

The emulation sequence generator 202 may internally
generate dynamic execution information through virtual
execution or actual execution. For an example input instruction
205, the instruction address of the BT instruction is used
in a branch predictor table of the branch predictor 203. The
branch predictor 203 predicts whether the branch is to be
taken or not according to known technology and issues a
"likely" flag accordingly. The "likely" flags for the PT and
BNE instructions are provided as dynamic execution information
by the branch predictor 203 to the emulation
sequence generator 202.

The "likely" flag is generated by the branch predictor 203
of the preferred embodiment on run-time information and
therefore the prediction is not static or fixed, rather it is
dynamic, which may also be through known technology.
The invention is usable with dynamic execution information
other than branch prediction of the example. With static
branch prediction, prediction accuracy is between 50% and
90%. Dynamic branch predictors like that used in the
embodiment are frequently well over 90% accurate. Any
kind of branch predictor 203 may be used for branch
prediction as one form of dynamic execution information,
although a dynamic predictor is preferred. The design or
absence of the branch predictor 203 does not affect the broad
implementation of this invention that uses dynamic execution
information in general to generate the instruction emulation
sequence 204. As mentioned, a new enhanced branch
predictor is described later.

Solution 2. The new emulation of the embodiment keeps
track of temporary registers TRn and registers Rn, where n
is a whole number 0, 1, 2, . . . , for example as a hardware
state machine. A new enhanced historical register usage
table as a usage resource is described later.

The emulation sequence generator 202 maintains a list of
temporary registers TRn and registers Rn, used in the
instruction emulation sequence 204. An example list of
temporary registers TRn is caller-save registers as defined by
an ABI instruction or a subset thereof. The emulation
sequence generator 202 uses heuristics (approaches or algorithms
to find a correct solution of a programming task, non
rigorous or self-learning) that determine which temporary
register TRn or register Rn should be used by the emulation
sequence generator 202 so as not to overwrite valid data
already in a register.

The embodiment determines execution information as to
which of the registers TRn, Rn have contents that may be
used during the forthcoming execution of the emulated
sequence of instructions (valid contents) and which of the
registers TRn, Rn have contents that are not to be used
during the forthcoming execution of the emulation sequence
(don't care contents); the former are not rewritten and the
latter may be rewritten as needed. The naming or renaming
of temporary registers and/or order of instructions in the
instruction emulation sequence is dynamically changed
according to the execution information to avoid register
conflict.
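The valid / don't care bookkeeping described for Solution 2 can be sketched as follows. This is only an illustration under assumed names (`TempRegisterTracker`, `allocate`, `release`); the specification describes a hardware state machine and heuristics, and does not give this first-free policy.

```python
# Minimal sketch of tracking which temporary registers hold valid
# contents. The class name, method names and the first-free selection
# policy are assumptions for illustration, not taken from the source.

class TempRegisterTracker:
    def __init__(self, names):
        # False means "don't care" contents: the register may be rewritten.
        self.valid = {name: False for name in names}

    def allocate(self):
        """Pick a register whose contents are don't care and mark it valid.

        Returns None when every register still holds valid contents, in
        which case the caller would have to wait or fall back.
        """
        for name, is_valid in self.valid.items():
            if not is_valid:
                self.valid[name] = True
                return name
        return None

    def release(self, name):
        """Mark a register's contents as no longer needed."""
        self.valid[name] = False


if __name__ == "__main__":
    tracker = TempRegisterTracker(["TR0", "TR1"])
    first = tracker.allocate()    # used by the first branch
    second = tracker.allocate()   # second branch gets a different register
    print(first, second)
```

With two branches in flight, the second allocation never lands on the register still needed by the first, which is the register conflict the paragraph above aims to avoid.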

Solution 3. The new emulation of the embodiment keeps
a record of branches and provides branch prediction codes or
flags for each branch, as is known. A new enhanced historical
branch prediction table as a dynamic execution information
usage resource 201 or a state machine output of the
branch predictor 203 is described later in more detail, but in
general scripts or heuristics consider the branch predictions
of plural branches together as a group to determine a new
and additional branch prediction of the group, which new
and additional branch prediction may be different from that
of any member of the group.
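One way to see how a group prediction can differ from every member's prediction is the following sketch. It is a hypothetical heuristic, not the one in the specification: it assumes each branch has an estimated taken probability (a quantity the source does not define) and that the branches are independent, and it flags a path through the group as "likely" only if its joint probability exceeds one half.

```python
# Hypothetical group-prediction heuristic. The per-branch probabilities,
# the independence assumption and the 0.5 threshold are all assumptions
# made for illustration only.
from itertools import product


def flag(probability, threshold=0.5):
    """Map a probability to a 'likely' / 'unlikely' flag."""
    return "likely" if probability > threshold else "unlikely"


def group_path_probabilities(taken_probs):
    """Joint probability of every taken/not-taken path through the group."""
    paths = {}
    for outcome in product((True, False), repeat=len(taken_probs)):
        p = 1.0
        for taken, prob in zip(outcome, taken_probs):
            p *= prob if taken else 1.0 - prob
        paths[outcome] = p
    return paths


if __name__ == "__main__":
    probs = [0.6, 0.6]                      # each branch alone: "likely"
    paths = group_path_probabilities(probs)
    print([flag(p) for p in probs])         # individual flags
    print(flag(paths[(True, True)]))        # flag for "both taken"
```

Two successive branches give four paths, and here each branch alone is flagged "likely" while the path on which both are taken is flagged "unlikely", so the group flag differs from both member flags.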

As an example, consider the following original sequence 
of instructions 209: 



EXAMPLE 3

LINE      OP CODE   OPERAND ETC.      FLAG
Line 1    CMPEQ     R1, R3
Line 2    BT        #disp1            // likely
Line 3    CMPEQ     R4, R6
Line 4    BT        #disp2            // unlikely

CMPEQ is a compare instruction. The emulated sequence
of instructions that would be produced from the EXAMPLE
3 by a prior art emulation sequence generator is:

EXAMPLE 4

LINE      OP CODE   OPERAND ETC.      FLAG
Line 1    CMPEQ     R1, R3, R19
Line 2    PT        #disp1, TR0       // likely
Line 3    BNE       R19, R63, TR0     // likely
Line 4    CMPEQ     R4, R6, R19
Line 5    PT        #disp2, TR0       // likely
Line 6    BNE       R19, R63, TR0     // likely

With the embodiment of the invention, the modified
emulated sequence of instructions 204 produced from the
EXAMPLE 4 prior art emulated sequence of instructions
and produced by the emulation sequence generator 202 is:

EXAMPLE 5

LINE      OP CODE   OPERAND ETC.      FLAG
Line 1    CMPEQ     R1, R3, R20
Line 2    PT        #disp1, TR0       // likely
Line 3    BNE       R20, R63, TR0     // likely
Line 4    CMPEQ     R4, R6, R19
Line 5    PT.NT     #disp2, TR1       // unlikely
Line 6    BNE.NT    R19, R63, TR1     // unlikely



In the EXAMPLE 5 modified emulated sequence of
instructions, the ".NT" is used to distinguish a second
instruction, for example in line 5, from an identical OP
CODE, in the example found in line 2. The branch predictor
203 provides even more sophisticated dynamic information
than previously discussed, namely an additional flag condition
of "unlikely", which improves run-time performance.

There are three important dynamic changes to the 
EXAMPLE 4 in producing the EXAMPLE 5, namely: 

Change 1) the target register specified is different in lines
5 and 6, that is, the temporary register TR0 has been changed
to temporary register TR1 in lines 5 and 6.

Change 2) the "likely" flag condition specified is different 
in lines 5 and 6, that is, for the two branch instructions in 
lines 5 and 6 the "likely" flag conditions have been changed 
to "unlikely". 

Change 3) more than one condition code is created, 
namely "likely" and "unlikely". 

The PT instruction requests a prefetch and the target
instructions are brought into special buffers attached to each
target temporary register TRn. In the EXAMPLE 4, the
target instructions of the first branches of lines 2 and 3 are
displaced by the target instructions of the branches of lines
5 and 6. When the EXAMPLE 4 is in a loop or a recursive
call, there are many unnecessary or redundant instruction
fetches. In the EXAMPLE 5, the target temporary register
TR1 used in lines 5 and 6 is different from the target
temporary register TR0 used in lines 2 and 3 and there is no
clash. Therefore, with the embodiment, in accordance with
the solution 2) above, upon an iteration of the sequence in
the loop or upon a recursive invocation of the sequence,
machine cycles to rewrite the temporary register TR0 twice
per iteration or invocation as in the prior art EXAMPLE 4
are not necessary, because of the provision of multiple
temporary registers TR0 and TR1 whose contents do not
change during the iterations of the loop or the recursive
invocations of the call. Therefore machine cycles are saved
and the sequence executes faster with the embodiment than
it does with the prior art.
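The target-register change described above (Change 1) can be sketched as a small pass over an emulated sequence. This is only an illustration under stated assumptions: the `Instr` record, the function name `assign_target_registers`, the convention that the target register is the last operand, and the pool size of eight target registers are all hypothetical and are not taken from the specification. The sketch covers only the register change, not the ".NT" flag change.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str         # op code, e.g. "PT" or "BNE"
    operands: list  # operand strings; the target register is assumed to be last


def assign_target_registers(seq, num_target_regs=8):
    """Give every PT/branch pair its own target register TRn (hypothetical pass).

    Each PT takes the next register in round-robin order, and the branch that
    follows it is rewritten to use the same register, so a later prefetch does
    not displace the buffer filled by an earlier one.
    """
    out = []
    next_index = 0
    current = None
    for ins in seq:
        operands = list(ins.operands)  # copy, so the input sequence is untouched
        if ins.op.startswith("PT"):
            current = f"TR{next_index % num_target_regs}"
            next_index += 1
            operands[-1] = current
        elif ins.op.startswith("BNE") and current is not None:
            operands[-1] = current
        out.append(Instr(ins.op, operands))
    return out


# The prior art sequence of EXAMPLE 4, where both PT instructions target TR0.
example4 = [
    Instr("CMPEQ", ["R1", "R3", "R19"]),
    Instr("PT", ["#disp1", "TR0"]),
    Instr("BNE", ["R19", "R63", "TR0"]),
    Instr("CMPEQ", ["R4", "R6", "R19"]),
    Instr("PT", ["#disp2", "TR0"]),
    Instr("BNE", ["R19", "R63", "TR0"]),
]
modified = assign_target_registers(example4)
```

Applied to the EXAMPLE 4 sequence, the second PT and its branch end up on TR1 while the first pair stays on TR0, which matches the register assignment shown in EXAMPLE 5.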

Many computer architectures specify condition codes. 
These codes are typically used to describe status or condi- 
tions such as branch taken/not taken, overflow, carry, and 
negative. 

As an example, a condition code may denote whether the
result was negative or whether a branch should be taken. A
single branch condition code bit is implicitly updated by all
compare instructions (CMPEQ) and implicitly used by all
conditional branch instructions, for example BNE.NT. This
single resource, the branch condition code bit, is the cause
of a potential bottleneck. In the EXAMPLE 4 of prior
art emulation, the first compare instruction CMPEQ of line
1 sets R19, the value of which is then used by the first branch
BNE of line 3; the second compare-branch pair of lines 4
and 6, respectively, uses the same R19. In the EXAMPLE 5
of the embodiment, the first compare CMPEQ of line 1 sets
R20, the value of which is then used by the first branch BNE
of line 3; the second compare-branch pair of lines 4 and 6
uses R19. In effect, the instruction emulation sequence of the
embodiment has created two condition codes even though
the original architecture of the input instruction 205 and the
second prior art instruction emulation sequence define only
one. This means that with the EXAMPLE 5 of the
embodiment, the second compare instruction can be moved ahead
of the first branch safely as it does not destroy the register
contents (in this case, R20) used by the branch.

As another example using an out-of-order issue machine,
including a processor of instructions, instructions are often
issued and executed out-of-order to allow independent
instructions to go ahead of other stalled instructions. The
out-of-order execution of instructions is possible only when
there is no true data dependency between the instructions
whose order is to be changed, that is when the instructions
whose order is to be changed are independent instructions.

In EXAMPLE 4 of the prior art emulation, the second PT
instruction of line 5 targets temporary register TR0. If this
second PT instruction of line 5 were moved ahead, it is
desirable that the first BNE instruction of line 3, which uses
temporary register TR0, should not be affected. An
out-of-order issue machine recognizes this data dependency of the
second PT instruction of line 5 and the first BNE instruction
of line 3 using the same target temporary register TR0 and
the out-of-order issue machine might rename temporary
register TR0 of line 5, if a renaming resource is available. If
no renaming resource is available, the second PT instruction
of line 5 is stalled to avoid a Write-After-Read (WAR)
hazard.

In EXAMPLE 5 of the embodiment modified emulated
sequence, the second PT instruction of line 5 targets
temporary register TR1. Therefore, the second PT instruction of
line 5 can be moved upward in the sequence across a much
greater number of instructions than is the case with
EXAMPLE 4, without the possibility of WAR, and then
there is no renaming of registers necessary during running of
the modified emulated sequence of instructions. Therefore
the potential problem of stall of a WAR hazard is solved in
the emulation sequence generator; and there is no
requirement of a subsequent prior art solution after emulation and
during the execution of a prior art emulated sequence of
instructions. Such is another advantage of the present
embodiment.
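The difference between EXAMPLE 4 and EXAMPLE 5 with respect to WAR hazards can be checked mechanically. The following is a minimal sketch, assuming each instruction is described by hand as a tuple of (text, registers read, registers written); the function name `war_hazards` and this tuple layout are illustrative choices, not part of the specification. The check is deliberately simple and reports every earlier-read/later-write pair on the same register.

```python
def war_hazards(seq):
    """List (reader_line, writer_line, register) triples for WAR hazards.

    A WAR hazard is reported whenever a later instruction writes a register
    that an earlier instruction reads, so the later one cannot be moved
    ahead of the earlier one without renaming.
    """
    hazards = []
    for i, (_, reads, _) in enumerate(seq):
        for j in range(i + 1, len(seq)):
            later_writes = seq[j][2]
            for reg in sorted(reads & later_writes):
                hazards.append((i + 1, j + 1, reg))  # 1-based line numbers
    return hazards


# EXAMPLE 4 (prior art): R19 and TR0 are each reused by the second pair.
example4 = [
    ("CMPEQ R1, R3, R19", {"R1", "R3"}, {"R19"}),
    ("PT #disp1, TR0", set(), {"TR0"}),
    ("BNE R19, R63, TR0", {"R19", "R63", "TR0"}, set()),
    ("CMPEQ R4, R6, R19", {"R4", "R6"}, {"R19"}),
    ("PT #disp2, TR0", set(), {"TR0"}),
    ("BNE R19, R63, TR0", {"R19", "R63", "TR0"}, set()),
]

# EXAMPLE 5 (embodiment): the first pair uses R20/TR0, the second R19/TR1.
example5 = [
    ("CMPEQ R1, R3, R20", {"R1", "R3"}, {"R20"}),
    ("PT #disp1, TR0", set(), {"TR0"}),
    ("BNE R20, R63, TR0", {"R20", "R63", "TR0"}, set()),
    ("CMPEQ R4, R6, R19", {"R4", "R6"}, {"R19"}),
    ("PT.NT #disp2, TR1", set(), {"TR1"}),
    ("BNE.NT R19, R63, TR1", {"R19", "R63", "TR1"}, set()),
]
```

For the EXAMPLE 4 data the function reports hazards between line 3 and lines 4 and 5 (on R19 and TR0 respectively); for the EXAMPLE 5 data it reports none, which is why lines 4 and 5 are free to move ahead of the first branch.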

In the specific computer instruction set of the examples,
there is a two-cycle latency between a PT instruction and a
dependent branch, to be referred to as a PT-BR stall. The use
of the target temporary register TR0 in line 2 of the
EXAMPLE 5 of an embodiment modified emulated
sequence, which is different from the target temporary
register TR1 used in the second PT instruction (PT.NT)
found in line 5, allows the second PT.NT instruction to be
moved ahead, which thereby avoids the above-mentioned
PT-BR stall and can run as an out-of-order sequence while
avoiding unnecessary stalls and wasted cycles.

When the branch predictor 203 provides input to the
emulation sequence generator 202 with specific dynamic
execution information of the "likely" flag, the emulation
sequence generator 202 modifies the emulation as needed to
avoid the problems mentioned herein, to generate the
modified emulated sequence of instructions 204 accordingly. In
the EXAMPLE 5 of the embodiment, the first PT and BNE
instructions of lines 2 and 3, respectively, are marked "likely"
while the second pair of PT.NT and BNE.NT instructions of
lines 5 and 6, respectively, are marked "unlikely". This
additional flag value, namely "unlikely", improves run-time
performance, particularly if a group of branches are
considered together and two or more "unlikely" flags of the
group provide a new additional condition flag of the group
as "likely". With even more sophisticated dynamic
information available, the second PT.NT instruction of line 5
is flagged as "likely", while the second BNE.NT instruction
of line 6 is flagged as "unlikely". The above-mentioned U.S.
patent provides a means for providing multiple condition
flags, which may be evaluated for a group of branches
according to the present embodiment. This would prefetch
the target instructions of the second PT.NT instruction;
however, the branch is predicted to be "unlikely". This is
useful to prefetch target instructions of those branches that
are usually unlikely, but are guaranteed to be taken from
time to time, and which, if taken together with another
"unlikely" branch as a group, give the group a higher
likelihood than any member of the group so that the group
as a whole becomes "likely".

Conventional emulation sequence generation can have 
undesirable side effects like polluting data buffers. For 
example, when a data buffer is rewritten due to emulation, 
the data buffer is polluted if the thus destroyed valid value 
is needed in a subsequent instruction, whose execution will 
later use a rewritten invalid value from the data buffer due 
to the pollution. 

Overflow occurs when data resulting from executing a 
sequence of instructions requires more bits than have been 
provided in hardware or software to store the data. Examples 
of overflow involve floating-point operations where the 
result is too large for the number of bits allowed for the 
exponent, a string that exceeds the bounds of the array 
allocated for it, and an integer operation whose result 
contains too many bits for the register into which it is to be 
stored. In general, overflow occurs when a number resulting 
from some arithmetic operation is too large to be contained 
in the data structure that a program provides for it. Under 
such conditions, it is common in the prior art to have a usage 
resource produce dynamic execution information, such as 
setting an overflow error flag. 
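As a small illustration of the integer case above, the following sketch adds two values with a carry-in and reports the bit that does not fit in the register. The 32-bit register width and the name `addc32` are assumptions made for the example only; the specification does not state a register width.

```python
MASK32 = 0xFFFFFFFF  # assumed register width of 32 bits


def addc32(a, b, carry_in=0):
    """Add two 32-bit values plus a carry-in.

    Returns (result, carry_out), where result is truncated to 32 bits and
    carry_out is 1 when the true sum needed more bits than the register has.
    """
    total = (a & MASK32) + (b & MASK32) + (carry_in & 1)
    return total & MASK32, total >> 32
```

Here `addc32(0xFFFFFFFF, 1)` yields a zero result with the carry-out set, which is the kind of event a usage resource would record as dynamic execution information.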

In the original sequence of instructions, there may be no 
problem of an overflow error. The inventor has determined 
that the emulated sequence of instructions as produced by 
the prior art may have an overflow problem that could not 
have been anticipated by the original programmer. The 
instruction emulation sequence of an example original 
instruction, ADDC, executable with a first instruction set, is: 

EXAMPLE 6

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   ADDC      R1, R2         R1 + R2 + CARRY



The above code involves the use of a hidden resource, a 
branch condition code bit, which by convention is stored in 
R29. Upon overflow, the branch condition code bit in 
register R29 is updated. An example of an emulated
sequence of instructions executable in a second instruction 
set, obtained by translating EXAMPLE 6, according to the 
prior art, is as follows: 



EXAMPLE 7

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   ADDC      R1, R2, R3     R1 + R2 + CARRY

The above code uses three explicit registers and one
implicit register R29. When there is an overflow, register R3
is corrupted if it contained valid data prior to the overflow.
The setting of the overflow flag is dynamic execution
information that is responded to by using the dynamic
execution information of the historical record of register
usage to modify the emulated sequence of instructions,
which include the instructions of EXAMPLE 7, by
allocating a currently unused register, for example R20 from the
pool of don't care data registers, to substitute for R3 and
obtain the modified emulated sequence of instructions of
EXAMPLE 8 according to the embodiment.

EXAMPLE 8

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   ADDC      R1, R2, R20    R1 + R2 + CARRY

The following original instruction of EXAMPLE 9,
executable with a first instruction set, has a special load
instruction that increments the loaded value. The prior art
emulated instructions for a different instruction set are in
EXAMPLE 10. The emulated instructions, according to the
prior art, have more lines and typically use fixed registers.

EXAMPLE 9

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   LDm       Rb, Rt         Load and post add 1 to address

EXAMPLE 10

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   LD        Rb, Rt         Load
Line 2   ADD       Rb, Rt, R3     Rb + 1, Rt + 1

The above prior art emulated code of EXAMPLE 10 uses
three registers, including a new temporary register R3. With
the code of EXAMPLE 10, register R3 is corrupted if it
contained valid data prior to the ADD. The event of an
additional register being used for the emulated sequence of
instructions as compared to the registers used by the original
sequence of instructions is dynamic execution information.
This dynamic execution information is responded to by
using the dynamic execution information of the historical
record of register usage to modify the emulated sequence of
instructions that include the instructions of EXAMPLE 10.

The modification allocates a currently unused register, for
example R20 from the pool of registers, to substitute for R3
and obtain the modified emulated sequence of instructions of
EXAMPLE 11, according to the embodiment.

EXAMPLE 11

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   LD        Rb, Rt         Load
Line 2   ADD       Rb, Rt, R20    Rb + 1, Rt + 1

The prior art fixed nature of a hardware translation to an
emulated sequence of instructions causes an uneven
utilization of machine resources. For example, when one register
is repeatedly used for temporary values due to emulation, the
instructions that depend upon those values must be executed
before the register value is changed and thus the use of
out-of-order machines and parallel or multiprocessing is
limited. Stall or WAR hazard may occur. Resolving such
dependencies according to the prior art can potentially waste
hardware resources. For example, such dependencies of the
emulated sequence may require that multiprocessing,
out-of-order processing, etc. not be used.

When a division (DIV) and an add (ADD) that both use
R3 follow each other in an original sequence of instructions
in a first instruction set, as in EXAMPLE 12, there is no
problem if the instructions are executed in parallel.

EXAMPLE 12

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   DIV       R1, R2, R3     result in R3
Line 2   ADD       R1, R2         R1 + R2, result in R2

However, the emulated sequence of instructions to run
with a second instruction set, according to the prior art, uses
the register R3 for both the add result and the division result,
as in EXAMPLE 13.

EXAMPLE 13

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   DIV       R1, R2, R3     result in R3
Line 2   ADD       R1, R2, R3     R1 + R2, result in R3

The ADD instruction uses R3 as a temporary register.
However, since the preceding DIV is likely to take more
machine cycles than the ADD, the value left over from the
division may finally get stored incorrectly in R3. If, instead,
the ADD used a different temporary register, R20 in this
case, then the value in R3 after the execution of both
instructions is the result of the ADD, which is the correct
value, as in EXAMPLE 13. The embodiment of the present
invention generates the modified sequence of instructions of
EXAMPLE 14 and the problem is solved.

EXAMPLE 14

LINE     OP CODE   OPERAND ETC.   COMMENT
Line 1   DIV       R1, R2, R3     result in R3
Line 2   ADD       R1, R2, R20    R1 + R2, result in R20



When the embodiment of the present invention generates 
the emulated sequence of instructions of EXAMPLE 13, 
with the emulation sequence generator 202, the addition of 
a new register to the code in emulating line 1 is noted as 
dynamic execution information by the resource usage 201. 
The resource usage 201 as a part of the emulator compares 
the registers of the original sequence of instructions with the 
registers used by the emulated sequence of instructions. This 
resource usage is preferably a historical state machine of
register usage that keeps a complete history of registers
used, for example a separately stored record for each register
use, each record including: register identification, for
example, R3; the value in the register as a result of the use;
a valid flag for that value, which is set to valid when the
value is stored and cleared when the value has been used for
the last time; and possibly the line number and instruction
where used in the sequence. This historical record
alternatively is kept in a look-up table by software implementation
as a usage resource 201.

All registers that do not have their valid flag set, that is 
that have don't care data, constitute a pool of registers 
available for rewriting. Some registers may be permanently 
removed from the pool and the record keeping. The histori- 
cal state machine of register usage therefore contains 
dynamic execution information, which is responded to by 
using the dynamic execution information of the historical 
record of register usage to modify the emulated sequence of 
instructions. 

The modification of the embodiment involves allocating a 
currently unused register, for example R20 from the pool of 
available registers (those not having a valid flag indicating 
valid data, that is, don't care data registers), to substitute for 
the register R3. Thus R20 is newly added to the prior art 
emulated sequence of instructions of EXAMPLE 13, and 
thereby the embodiment obtains the modified emulated 
sequence of instructions of EXAMPLE 14. 
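The record keeping and register substitution described above can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the state machine of the embodiment: the names `UsageRecord`, `RegisterHistory`, `store`, `last_use`, `pool` and `substitute_for` are hypothetical, and the policy of taking the first free register is an arbitrary choice made for the example.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    reg: str            # register identification, e.g. "R3"
    value: int          # value left in the register by the use
    line: int           # line number of the instruction in the sequence
    valid: bool = True  # set on store, cleared after the last use


class RegisterHistory:
    """Hypothetical look-up table of register usage records."""

    def __init__(self, registers):
        self.registers = list(registers)  # registers tracked by the table
        self.records = []

    def store(self, reg, value, line):
        # Every use gets its own separately stored record.
        self.records.append(UsageRecord(reg, value, line))

    def last_use(self, reg):
        # After its last use the value becomes "don't care" data.
        for rec in self.records:
            if rec.reg == reg:
                rec.valid = False

    def holds_valid_data(self, reg):
        return any(r.reg == reg and r.valid for r in self.records)

    def pool(self):
        # Registers without valid data are available for rewriting.
        return [r for r in self.registers if not self.holds_valid_data(r)]

    def substitute_for(self, reg):
        # Keep the register if it is free, otherwise take one from the pool.
        if not self.holds_valid_data(reg):
            return reg
        free = self.pool()
        if not free:
            raise RuntimeError("no free register available")
        return free[0]
```

With R3 and R19 holding valid data and R20 unused, `substitute_for("R3")` returns "R20", mirroring the substitution that turns EXAMPLE 13 into EXAMPLE 14.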

A branch prediction guesses whether a branch will be 
taken in a program and fetches code accordingly. When a 
branch is considered as "likely", the next instruction of the 
branch sequence is stored in fast memory, such as a cache, 
and the "next instruction" is therefore ready to be removed 
from such fast storage to be used the next time a branch, 
which may or may not be the same branch, is encountered. 
Thereby there is a prediction as to which way the instruction 
will branch, which prediction may be correct about 90% of 
the time. Dynamic branch prediction information is obtained 
from the branch predictor 203, which according to one 
embodiment is a branch prediction state machine. 

The improved or enhanced branch prediction state
machine of the embodiment keeps an historical record of
each of all branches at particular addresses and whether or
not each of the branches was taken. This historical record
alternatively is kept in a look-up table by software
implementation as a usage resource 201. The emulated sequence
of instructions may have a plurality of branches (by adding
branches during emulation or because the original sequence
of instructions had a plurality of branches), each having a
branch prediction of "unlikely".

For example, the percentage of time each branch is taken
may be 40% and accordingly the table sets a branch
prediction of "unlikely" for each of the branches. However, the
embodiment emulator, because of the novel historical record
of dynamic branch prediction execution information,
determines that the branches function as a group. As a
specific example, they may function as a group because they
branch on the same error condition, so that as a group the
branch prediction is likely. Therefore, the embodiment
modifies the computer system by changing the branch
predictor or by modifying the emulated sequence of
instructions to provide a new branch prediction for the group,
which prediction is then "likely". For example, two branches,
adjacent or separated by instructions, each have a likelihood
of 40% of being taken and therefore in the prior art each
branch is labeled with a condition code of "unlikely" or
merely not flagged as "likely". However, when each of the
two branches is controlled by the same condition or each
branches to the same address, for example, then the
likelihood of branching, for the group of two branches, becomes
96%, which group likelihood is recognized by the
embodiment, and then the embodiment modifies the instructions so
that each of the branches or one of them has its branch
prediction condition code or flag changed to "likely".
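One way to picture a group prediction derived from a historical record is to count, over recorded passes through the sequence, how often at least one branch of the group was taken. The sketch below is an assumption about how such a record could be evaluated, not the method of the embodiment; the function names, the per-pass dictionary layout, and the 0.5 threshold are hypothetical, and the sample history is illustrative data that does not reproduce the percentages quoted in the text.

```python
def taken_rate(history, branch):
    """Fraction of recorded passes in which a single branch was taken."""
    return sum(1 for run in history if run[branch]) / len(history)


def group_taken_rate(history, group):
    """Fraction of recorded passes in which any branch of the group was taken."""
    return sum(1 for run in history if any(run[b] for b in group)) / len(history)


def predict(rate, threshold=0.5):
    """Map a taken rate to a flag value (threshold is an assumed parameter)."""
    return "likely" if rate > threshold else "unlikely"


# Illustrative history of ten passes: branch A taken in passes 0-3,
# branch B taken in passes 4-7, neither taken in passes 8-9.
history = [{"A": i < 4, "B": 4 <= i < 8} for i in range(10)]
```

In this sample each branch on its own is taken in 4 of 10 passes and would be flagged "unlikely", while the group is taken in 8 of 10 passes and would be flagged "likely".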

The change in condition code or flag as provided by the
embodiment is made possible by the generation of the novel
historical record of dynamic branch prediction execution
information. The historical record of dynamic branch
prediction execution information preferably includes a record
for each branch, with each record comprising: the branch op
code line number, flag or multi-condition code, condition or
event implementing the branch and target of the branch, for
example. The following is a specific example of use of the
historical record of dynamic branch prediction execution
information.

The original sequence of instructions includes the
following branch instruction of EXAMPLE 15, which was
explained previously with respect to EXAMPLE 1.

EXAMPLE 15

LINE     INSTR.    OPERAND ETC.     FLAG
Line 1   BT        #disp            // unlikely
Line 2   BT        #disp            // unlikely



In the EXAMPLE 15, the FLAG specifies the likely flag
content provided by the prior art branch predictor. The
emulated sequence of instructions obtained from the
EXAMPLE 15, according to the prior art, is:

EXAMPLE 16

LINE     OP CODE   OPERAND ETC.     FLAG
Line 1   PT        #disp1, TR0      // unlikely
Line 2   NOP
Line 3   BNE       R19, R63, TR0    // unlikely
Line 4   PT        #disp2, TR0      // unlikely
Line 5   NOP
Line 6   BNE       R19, R63, TR1    // unlikely



With the embodiment of the invention, the modified
emulated sequence of instructions 204 produced from the
EXAMPLE 16 by the emulator is:

EXAMPLE 17

LINE     OP CODE   OPERAND ETC.     FLAG
Line 1   PT        #disp1, TR0      // likely
Line 2   NOP
Line 3   BNE       R19, R63, TR0    // likely
Line 4   PT        #disp2, TR1      // likely
Line 5   NOP
Line 6   BNE       R20, R63, TR1    // likely

FIG. 3 is a flow chart of the method of operation of the
emulator, applicable to hardware and/or software and/or
firmware implementation.

Step 300 inputs the next instruction in a sequence of
instructions in a first instruction set, which in the example
was the BT instruction with parameters of a target address
in a PC-relative offset "#disp" and temporary register TR0.
This BT input is provided at the input 205 of FIG. 1, which
could originate from the devices of FIG. 2, for example at
the input 110, at the keyboard 109, as a selection choice
displayed on the monitor 108, from cache 105, from ROM
113, from storage 107, from I/O 111, from network 112 in
general, or from another computer 300. This instruction, BT,
is passed to step 301.

Step 301 generates, in a known manner, an instruction
emulation sequence in a second instruction set, which
second instruction set is different from the first instruction set
from step 300. This instruction emulation sequence in a
second instruction set is generated by the emulator 201, 202
and 203 of FIG. 1, the emulator 106 of FIG. 2.

Decision step 302 analyzes the instruction emulation
sequence to determine if the parameters have a problem that
is evident from the instruction emulation sequence itself or
from the dynamic execution information, for example by
maintaining information as to the use of temporary registers
to see if a register with valid data is being rewritten. For a
more specific example, the system determines which of the
temporary registers TRn have contents that may be used
during the forthcoming execution of the emulation sequence
(valid contents) and which of the temporary registers TRn
have contents that are not to be used during forthcoming
execution of the emulation sequence (don't care contents),
so that the former should not be rewritten and the latter may
be rewritten as further temporary registers TRn are needed.
The emulation sequence generator 202 maintains a list of
temporary registers TRn, used as scratch registers in the
instruction emulation sequence 204. An example list of
temporary registers TRn is caller-save registers as defined by
an ABI instruction or a subset thereof. The emulation
sequence generator 202 uses heuristics that determine
which scratch register TR0 is used by the emulation
sequence generator 202. When the decision is that the
current parameters have no problem, operation proceeds to
step 304 for execution of the instruction emulation sequence
as a whole or as one instruction at a time. When the decision
is that the current parameters have a problem, for example
one register in a loop is being used for first and second data
with each iteration, operation proceeds to step 303 to solve
the problem.

Step 303 solves the problem found in step 302, for 
example adds a new register so that the first and second data 
may be in respective registers for each iteration of a loop, to 
save machine cycles and avoid register conflicts. Also, the 
problem may be solved by the naming or renaming of 
temporary registers to avoid register conflict. 

The execution, in step 304, generates execution informa- 
tion, such as the likelihood of a branch being taken or idle 
machine cycles waiting for a result before the next instruc- 
tion may be executed. The execution information is passed 
to step 305. 

Step 305 determines if the execution information identi- 
fies idle cycles and through heuristics determines if the 
sequence may be changed to move an instruction not need- 
ing the results being waited for to a position to precede the 
instruction waiting to be executed. Thereby, the moved 
instruction may use the previous idle time of the processor 
and thus save machine cycles. When the decision is that 
there is a problem of this type, operation proceeds to step 
306 to solve the problem. When the decision is that there is 
no problem of this type, operation proceeds to step 307. 

Step 306 is reached when the answer to the inquiry of
step 305 is yes and then one or more instructions of the
emulation sequence are moved to solve the problem, for
example to utilize the formerly idle machine cycles of the
processor. Operation then proceeds to step 307.

Step 307 determines if there are further parameter prob- 
lems as at least partially identified using the execution 
information from step 304, and when the answer is yes, 
operation proceeds to step 308 to solve the problems, and if 
the answer is no, operation proceeds to step 309. 

Step 308 solves the problems identified in step 307 by, for 
example, changing parameters such as registers or changing 
execution information such as adding an unlikely condition 
to a flag. These and other examples are more fully set forth 
elsewhere in this specification. Next, operation is passed to 
step 309. 

Step 309 determines if there are any more input instruc- 
tions from the first instruction set that are to be emulated, 
and when the answer is yes, then the control returns to step 
300 to input another instruction to be emulated. When the 
decision reaches a no answer, the emulation is finished and 
the emulation process ends. 

FIG. 3 is an embodiment for illustration of the invention, 
and the order of step groupings and individual steps may be 
changed. Also, some of the decision and problem solving 
pairs may operate in a loop to go through all or many of the 
instruction emulation sequence of the instructions that have 
been emulated to date, instead of just those resulting from 
the emulation of a single instruction from the first instruction 
set. Further, steps 300 and 301 could be completed for the
entire sequence of instructions in the first instruction set with 
the results stored, etc. before starting the steps 302 to 309, 
the latter step 309 returning to step 302, and in such case the 
emulation of the present invention would be more specifi- 
cally referred to as an emulation extension or enhancement. 
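The control flow of FIG. 3, as described in steps 300 to 309, can be summarized as a loop. The sketch below is only a skeleton under stated assumptions: the function `emulate` and its callable parameters (`generate`, `has_conflict`, `fix_conflict`, `execute`, `reorder`, `retune`) are hypothetical stand-ins for the emulator components, and the dictionary keys `idle_cycles` and `param_problem` are invented names for the execution information passed between steps.

```python
def emulate(instructions, generate, has_conflict, fix_conflict,
            execute, reorder, retune):
    """Skeleton of the FIG. 3 loop; every callable is supplied by the caller."""
    results = []
    for ins in instructions:                 # step 300 (and step 309 loop-back)
        seq = generate(ins)                  # step 301: emulation sequence
        if has_conflict(seq):                # step 302: parameter problem?
            seq = fix_conflict(seq)          # step 303: e.g. add/rename register
        info = execute(seq)                  # step 304: execution information
        if info.get("idle_cycles", 0) > 0:   # step 305: idle cycles found?
            seq = reorder(seq, info)         # step 306: move instructions
        if info.get("param_problem"):        # step 307: further problems?
            seq = retune(seq, info)          # step 308: change registers/flags
        results.append(seq)
    return results                           # step 309: no more input
```

Because each stage is passed in as a callable, the same skeleton accommodates the variations mentioned above, such as running steps 300 and 301 for the whole sequence first and only then applying the later steps.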

FIG. 4 is a flow chart of the method of operation of
another portion of the emulator embodiment, applicable to
hardware and/or software and/or firmware implementation.

Step 400 inputs the next instruction in an original
sequence of instructions that are executable with a first
instruction set.

Next the instruction input in step 400 is executed in step
402 and dynamic execution information as to each register
usage is stored as a separate record in an historical register
usage table, comprising fields of: register identity (both for
the original and emulated sequences); value stored in that
register; identity of the instruction causing the storing; flag
as to whether the instruction is an emulated instruction or not
(alternatively, separate tables are provided for the original
and emulated sequences of instructions); and a valid data
flag.

Step 403 determines if there are any more instructions to
be input from the original sequence of instructions, and
when there are, then the control returns to step 400. When
the decision 403 determines that there are no more
instructions to be input from the original sequence of instructions,
control is passed to step 404.

Step 404 returns to the start of the original sequence of
instructions and the method proceeds to step 405.

Step 405 inputs the next instruction in the original
sequence of instructions that are executable with a first
instruction set.

Step 406 generates, in a known manner, an instruction
emulation sequence that is executable with a second
instruction set, which second instruction set is different from the
first instruction set, by using the instruction input from step
405. This emulated sequence of instructions is generated by
the emulation sequence generator 202 of FIG. 1, the
emulator 106 of FIG. 2.

Next, in step 407, the emulated instruction is executed and
dynamic execution information as to each register usage is
stored as a separate record in an historical register usage
table, comprising fields of: register identity (both for the
original and emulated sequences); value stored in that
register; identity of the instruction causing the storing; flag as
to whether the instruction is an emulated instruction or not
(alternatively, separate tables are provided for the original
and emulated sequences of instructions); and a valid data
flag.

Step 407 further includes the generation of the historical
record of dynamic branch prediction execution information,
which preferably includes a record for each branch, with
each record comprising: the branch op code line number,
flag or multi-condition code, condition or event
implementing the branch and target of the branch, for example.

Step 408 determines if there are any more instructions to
be input from the emulated sequence of instructions, and
when there are, then the control returns to step 405 to input
another instruction. When the decision 408 determines that
there are no more instructions to be input from the original
sequence of instructions, control is passed to step 409.

Step 409 returns to the start of the original sequence of
instructions and the method proceeds to step 300 of FIG. 3
for modification of the emulated sequence of instructions in
response to the entire history of all register usage dynamic
execution information that is stored in tables according to
steps 402 and 407.

A comparison of the historical register usage dynamic
execution information for original instructions and emulated
instructions easily determines and identifies problems to be
solved. For example, when new registers have been added
by emulation and if at that time the data that was in the new
register was still valid (from when the original sequence of
instructions was run), then there is register pollution. The
valid data flag will identify registers that have invalid or
don't care data and such registers form a pool of registers
that may be used in the process of FIG. 3, for allocating new
registers when emulation is again performed for the original
sequence of instructions. The historical register usage
dynamic execution information in the tables as generated by
the method of FIG. 4 is particularly useful with respect to the
following examples, as well as with respect to an improved
usage resource for the previous examples.

With the present invention, wherein the instruction
parameters are changed dynamically to make some
instructions independent in position from other instructions within
the instruction emulation sequence, for example in the
EXAMPLE 5, the instruction emulation sequence may be
executed substantially simultaneously with its generation or
on a substantially real time basis. Therefore it is seen that the
execution of instructions may be in a different order than
they would be executed with the prior art emulation where
emulation does not dynamically change parameters and
permits position change of instructions by the processor
during execution.

The embodiments may be used when a program is
emulated and compiled, or re-emulated and recompiled or as a
part of run-time optimization.

This invention is useful in efficiently emulating
instructions by dynamically modifying a component of the
computer system, for example the parameters of the instruction
emulation sequence, particularly at run-time. This invention
is particularly useful in combination with processors (such
as microprocessors, scalar processors and out-of-order
processors) and their systems that emulate one or more
unimplemented instructions, in hardware or software. Such
uses include compatibility instruction emulation as well as
emulation to support programs written for other target
hardware (like Java byte codes). Hardware and firmware
implementation is particularly advantageous, because the
cost is modest, the speed is improved and there would be no
or software cost.

While the present invention has been described in
connection with a number of embodiments, implementations,
modifications and variations that have advantages specific to
them, the present invention is not so limited but covers
various obvious modifications and equivalent arrangements
according to the broader aspects, which fall within the spirit
and scope of the following claims.

What is claimed is:

1. A method performed by a computer system having a
hardware emulation, said method comprising the steps of:

A comparison of the historical record of dynamic branch 
prediction execution information for original instructions 
and emulated instructions for various combinations of 
branches into groups easily determines and identifies when 
a group of registers may be evaluated more likely than their 65 
members to provide a group branch prediction of likely for 
an improved performance purpose, as explained previously. 



obtaining a first set of one or more emulated instructions 
derived from an original set of one or more instructions 
using the hardware emulation; 

initiating execution of the first set of one or more emu- 
lated instructions; 

producing first dynamic execution information in 
response to executing the first set of one or more 
emulated instructions; and 

changing the hardware emulation dynamically for
producing a second set of one or more emulated
instructions by modifying at least a parameter of one
instruction of the first set of one or more emulated
instructions in response to said first dynamic execution
information, said step of changing, includes software
producing multiple conditions codes that replace a single
condition code of the first dynamic execution information.

2. The method of claim 1, wherein:

said step of changing, includes modifying at least a
register field of one instruction of the first set of one or
more emulated instructions.

3. The method of claim 1, wherein: 

said steps of executing, producing and changing are
conducted recursively on at least some of successive 
segments of the first set of one or more emulated 
instructions. 

4. The method of claim 1, wherein: 

said step of producing, produces branch prediction
information; and

said step of changing, changes condition codes of the 
branch prediction information. 

5. The method of claim 1, wherein: 

said step of producing, produces a history of register
allocation information; and

said step of changing, changes register allocation.

6. The method of claim 1, wherein: 

said step of producing, produces a history of branch
prediction dynamic execution information; and

said step of changing, generates a branch prediction
likelihood code for a group of branches that may be
different from any branch prediction of the members of
the group.

7. The method of claim 1, wherein: 

said step of modifying at least a parameter of one instruc- 
tion of the first set of one or more emulated instruc- 
tions, includes modifying a plurality of parameters of 
some instructions of the first set of one or more 
emulated instructions. 

8. A method performed by a computer system, said 
method comprising the steps of: 

obtaining an emulated sequence of instructions derived 
from an original sequence of instructions; 

initiating execution of the emulated sequence of instruc- 
tions; 

producing first dynamic execution information in 
response to executing the emulated sequence of instruc- 
tions; and 

changing the computer system dynamically to produce 
different dynamic execution information in response to 
said first dynamic execution information; 

wherein said step of changing, includes software produc- 
ing multiple conditions codes that replace a single 
condition code of the first dynamic execution informa- 
tion. 



